New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import speedup - Ticket #2181 #334
Conversation
Currently there are 5 places under numpy/polynomial which do something like: exec polytemplate.substitute(name='Hermite', nick='herm', domain='[-1,1]') I have changed those so there are start/end markers, like ... I have edited polytemplate.py so that it implements a __main__ which finds those block markers and replaces the content with the appropriate "polytemplate" substitution. This means that if you edit the template then you will need to manually run 'polytemplate.py' so that the dependent files are rebuilt based on the new template. The performance of python -c 'import time; t1=time.time(); import numpy; print time.time()-t1' goes from 0.079 seconds to 0.057 seconds (best of 10), for an import speedup of ~25%.
By the way, all tests pass on my 64-bit Ubuntu machine for py24 through py32 using tox. Looks like we got the bad Travis worker machine.. |
Yes, that's an unrelated failure. Not requiring a manual action would indeed be better I think. But I'm okay with whatever option @charris prefers. |
Looks like a lot of this is a style cleanup that should probably come in as a separate pull request. On the template, I have mixed feelings. The speed up isn't large and the added amount of code is significant. That said, I was never really happy with the template as is. And I'm not really happy with the current proposal either ;) It would be ideal if the files could be rewritten during the build phase, but that begins to become complicated... I'd like to think about possible solutions a bit more. Any other ideas? Note the main problem I had was the type specific documentation, otherwise the template could have been implemented as a mixin class. |
Rewriting at build time doesn't seem all that complicated; it's what I had in mind in the first place. Whether to do it like in this PR or by simply doing the current To be more concrete, what I was thinking was to rename the current files to an underscored version, i.e.
|
That seems reasonable and should be flexible enough that the template is still usable for other polynomial types. Not sure how this looks in Bento or Numscons. |
Although the polynomial package will no longer be stand alone. Not sure that that matters. |
Well, it's still stand alone in that it doesn't depend on the rest of Numpy. Don't think it matters much. |
I agree with Chuck that a lot of this diff is white-space and other style changes which should be separate from this PR. My $0.02 is that re-writing at build-time is not any more complicated than we are doing elsewhere and if it helps import times, I'm supportive. Everything I've heard from David tells me that this won't affect Bento at all, but David would know more about that. |
I've reverted the commit with the style clean ups and implemented build time generation of the polynomial class modules. My changes shouldn't cause any problems with the Bento build, but I can't confirm that because the Bento build is currently broken (unrelated). |
Actually, my bad on the Bento build. I think that a bscript implementing the build time generation will be required... |
I just removed import polynomial from numpy/init.py and things work just fine from my point of view. |
base_path = os.path.join('numpy', 'polynomial') | ||
|
||
fp = open(os.path.join(base_path, base_file)) | ||
lines = fp.readlines() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not fp.read?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No good reason. That would be better.
Removing
works. |
True, but I wonder if it's worth it to have it at that level. I put it there originally because that was how other packages were treated, but I wonder if requiring
Is too much to ask? If we refuse to make these small changes here and there it's hard to get things cleaned up. |
Indeed, it's hard to clean up. The cost of backwards compatibility I'm afraid. I agree it would have been better to not import it by default. I'm all for more and smaller namespaces anyway (I like |
Sigh. OK, the easy solution to import times isn't available. Last time I tried to deprecate imports didn't work that well eithe, I couldn't make the deprecation messages show up reliably. I was guessing that the package wasn't used that much until recently due to the missing documentation, but I think that is starting to change... |
Maybe the lazy import ideas would also be a good way to deprecate imports. |
Please no. I wasted so much time hunting down bugs and then removing the lazy imports from scipy. I think this PR is a much better solution. |
I was thinking 'and', not exclusive 'or' ;) However, it might be useful to have a way to reliably deprecate imports down the road. |
That would be nice. But still I'd rather not have that than have lazy imports. It will break the most random stuff (like py2exe maybe). |
(And for the record, I'd like a better explanation of why lazy importing won't work before we go breaking the API. I understand it caused problems in Scipy, and sure, likely it will cause problems here. But I can't tell how many of those problems were intrinsic to lazy importing, and how many were caused by _import_tools being incredibly convoluted and over-complicated. Certainly both tab-completion and trac bug #1501 seem to be _import_tools-specific problems. It's true py2exe would need some hand-holding to find numpy libraries, but py2exe needs hand-holding every time you sneeze, and has a whole wiki full of examples of how to do it.) |
On second thought, this is the wrong place for that discussion anyway. I suggest we leave the 'import .polynomial" out of this PR and take that discussion to the list? |
@njsmith Hmmm, something odd on my machine then. No errors reported by Tox, but when I install using Python3.2 directly I get the same failure reported by Travisbot: Traceback (most recent call last): |
if sys.version_info[0] >= 3: | ||
rel_import = "from . import" | ||
else: | ||
rel_import = "import" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bet it's the lack of this code that's causing the import problem on Py3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was also my first thought, but I don't understand why and won't be
able to get back to it until the middle of next week.
On Jul 13, 2012 6:09 PM, "njsmith" <
reply@reply.github.com>
wrote:
"""
import string-import sys
-if sys.version_info[0] >= 3:
- rel_import = "from . import"
-else:- rel_import = "import"
I bet it's the lack of this code that's causing the import problem on Py3.
Reply to this email directly or view it on GitHub:
https://github.com/numpy/numpy/pull/334/files#r1159923
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tl;dr We're confusing 2to3. Py3 builds of this branch against Git result in failure. Py3 builds against a source dist work fine. Regenerate numpy/polynomial/__init__.py
at build time for Py3, or use explicit imports everywhere.
I've just had another look at this. This code removal is unrelated to the Py3 errors. The problem is related to the 2to3 tool.
When a Py3 build is run directly from the repository, the 2to3 tool doesn't refactor the relative imports (e.g. from polynomial import Polynomial
) in numpy/polynomial/__init__.py
(presumably because the files don't exist at the time 2to3 is run and the tool must therefore assume they are meant to be external imports).
When a Py3 build is run from a source distribution, the numpy/polynomial/polynomial.py
etc. files already exist and the relative imports can be correctly generated in numpy/polynomial/__init__.py
. This explains why testing with tox works on my machine - the build and test cycle for each Py version is based on a single sdist created by the system python and cached at ~/.tox/distshare/
Travis CI farms out the build tasks and runs the build and test for each Py version against a fresh Git checkout - hence the reported Py3 failures.
To fix, numpy/polynomial/__init__.py
also needs to be regenerated at build time for Py3, or use explicit imports from numpy.polynomial.polynomial import ...
etc. everywhere.
@njsmith: unless I'm completely missing the point here, we're not breaking the API here. All we do is reducing import time, by removing an expensive |
|
Looks fine to me now. I'll leave @charris make the final determination though. |
I'm fine with putting this in, but I think the numscons build should work first. We can probably leave the bento build to David. I suspect David is travelling at the moment... @cournape David, what do we need to do for numscons and bento? |
For Bento - I now think that a I have zero understanding of Numscons. |
I'll take a look a numscons if David doesn't chime in. |
The install puts the generated files in numpy/polynomial instead of in the build directory, that needs to be fixed. |
Okay. I'll change that. The current method works, but I see the benefit of being able to revert to |
Thanks. Looks like it could be a bit of work to figure out where everything goes what with all the different directories that accumulate under build for the different the python versions etc. Somewhere the path gets passed down from the top but I'm not sure how. I've avoided dealing with build stuff over the years. |
Not tested on Windows/Mac etc.. |
I think this needs to be brought to completion. I'll take a look this weekend. |
@scottza needs a rebase. Much has changed since the original PR. |
@scottza In fact, I suggest you redo this PR starting with current master. That is probably the easiest way to clean up the history, and now that you know what you are doing it probably isn't that much work.. |
I'll take a look... On 10 August 2013 21:07, Charles Harris notifications@github.com wrote:
|
refactor: Simplify vaddw_* implementations
Applies the patch submitted at http://projects.scipy.org/numpy/ticket/2181
see mailing list discussion:
http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063130.html
especially - http://mail.scipy.org/pipermail/numpy-discussion/2012-July/063194.html - should polytemplate.py be run at build time instead of requiring it to be run manually when edited.